Search CORE

50 research outputs found

A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models

Author: Guan Yawen
Haran Murali
Publication venue
Publication date: 15/01/2018
Field of study

Inference for spatial generalized linear mixed models (SGLMMs) for high-dimensional non-Gaussian spatial data is computationally intensive. The computational challenge is due to the high-dimensional random effects and because Markov chain Monte Carlo (MCMC) algorithms for these models tend to be slow mixing. Moreover, spatial confounding inflates the variance of fixed effect (regression coefficient) estimates. Our approach addresses both the computational and confounding issues by replacing the high-dimensional spatial random effects with a reduced-dimensional representation based on random projections. Standard MCMC algorithms mix well and the reduced-dimensional setting speeds up computations per iteration. We show, via simulated examples, that Bayesian inference for this reduced-dimensional approach works well both in terms of inference as well as prediction, our methods also compare favorably to existing "reduced-rank" approaches. We also apply our methods to two real world data examples, one on bird count data and the other classifying rock types

arXiv.org e-Print Archive

FigShare

Unsupervised Semantic Representation Learning of Scientific Literature Based on Graph Attention Mechanism and Maximum Mutual Information

Author: Gao Hongrui
Guan Zeli
Li Yawen
Liang Meiyu
Publication venue
Publication date: 06/10/2022
Field of study

Since most scientific literature data are unlabeled, this makes unsupervised graph-based semantic representation learning crucial. Therefore, an unsupervised semantic representation learning method of scientific literature based on graph attention mechanism and maximum mutual information (GAMMI) is proposed. By introducing a graph attention mechanism, the weighted summation of nearby node features make the weights of adjacent node features entirely depend on the node features. Depending on the features of the nearby nodes, different weights can be applied to each node in the graph. Therefore, the correlations between vertex features can be better integrated into the model. In addition, an unsupervised graph contrastive learning strategy is proposed to solve the problem of being unlabeled and scalable on large-scale graphs. By comparing the mutual information between the positive and negative local node representations on the latent space and the global graph representation, the graph neural network can capture both local and global information. Experimental results demonstrate competitive performance on various node classification benchmarks, achieving good results and sometimes even surpassing the performance of supervised learning

arXiv.org e-Print Archive

Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data based on Information Loss Constraints

Author: Gao Jie
Guan Zeli
Li Yawen
Xue Zhe
Publication venue
Publication date: 22/06/2023
Field of study

The storage, management, and application of massive spatio-temporal data are widely applied in various practical scenarios, including public safety. However, due to the unique spatio-temporal distribution characteristics of re-al-world data, most existing methods have limitations in terms of the spatio-temporal proximity of data and load balancing in distributed storage. There-fore, this paper proposes an efficient partitioning method of large-scale public safety spatio-temporal data based on information loss constraints (IFL-LSTP). The IFL-LSTP model specifically targets large-scale spatio-temporal point da-ta by combining the spatio-temporal partitioning module (STPM) with the graph partitioning module (GPM). This approach can significantly reduce the scale of data while maintaining the model's accuracy, in order to improve the partitioning efficiency. It can also ensure the load balancing of distributed storage while maintaining spatio-temporal proximity of the data partitioning results. This method provides a new solution for distributed storage of mas-sive spatio-temporal data. The experimental results on multiple real-world da-tasets demonstrate the effectiveness and superiority of IFL-LSTP

arXiv.org e-Print Archive

Assessing the Impact of Retreat Mechanisms in a Simple Antarctic Ice Sheet Model Using Bayesian Calibration

Author: Forest Chris E.
Guan Yawen
Keller Klaus
Pollard David
Ruckert Kelsey L.
Shaffer Gary
Wong Tony E.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/01/2017
Field of study

The response of the Antarctic ice sheet (AIS) to changing climate forcings is an important driver of sea-level changes. Anthropogenic climate change may drive a sizeable AIS tipping point response with subsequent increases in coastal flooding risks. Many studies analyzing flood risks use simple models to project the future responses of AIS and its sea-level contributions. These analyses have provided important new insights, but they are often silent on the effects of potentially important processes such as Marine Ice Sheet Instability (MISI) or Marine Ice Cliff Instability (MICI). These approximations can be well justified and result in more parsimonious and transparent model structures. This raises the question of how this approximation impacts hindcasts and projections. Here, we calibrate a previously published and relatively simple AIS model, which neglects the effects of MICI and regional characteristics, using a combination of observational constraints and a Bayesian inversion method. Specifically, we approximate the effects of missing MICI by comparing our results to those from expert assessments with more realistic models and quantify the bias during the last interglacial when MICI may have been triggered. Our results suggest that the model can approximate the process of MISI and reproduce the projected median melt from some previous expert assessments in the year 2100. Yet, our mean hindcast is roughly 3/4 of the observed data during the last interglacial period and our mean projection is roughly 1/6 and 1/10 of the mean from a model accounting for MICI in the year 2100. These results suggest that missing MICI and/or regional characteristics can lead to a low-bias during warming period AIS melting and hence a potential low-bias in projected sea levels and flood risks.Comment: v1: 16 pages, 4 figures, 7 supplementary files; v2: 15 pages, 4 figures, 7 supplementary files, corrected typos, revised title, updated according to revisions made through publication proces

arXiv.org e-Print Archive

DigitalCommons@University of Nebraska

Directory of Open Access Journals

Copenhagen University Research Information System

PubMed Central

FigShare

Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods

Author: Guan Yawen
Majumder Suman
Reich Brian J.
Saibaba Arvind K.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2022
Field of study

Analyzing massive spatial datasets using a Gaussian process model poses computational challenges. This is a problem prevailing heavily in applications such as environmental modeling, ecology, forestry and environmental health. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate the spatial covariance parameters and makes spatial predictions with uncertainty quantification for point-referenced spatial data. The proposed method, Kryging, applies for both observations on regular grid and irregularly-spaced observations, and for any Gaussian process with a stationary isotropic (and certain geometrically anisotropic) covariance function, including the popular Matérn covariance family. We make use of the block Toeplitz structure with Toeplitz blocks of the covariance matrix and use fast Fourier transform methods to bypass the computational and memory bottlenecks of approximating log-determinant and matrix-vector products. We perform extensive simulation studies to show the effectiveness of our model by varying sample sizes, spatial parameter values and sampling designs. A real data application is also performed on a dataset consisting of land surface temperature readings taken by the MODIS satellite. Compared to existing methods, the proposed method performs satisfactorily with much less computation time and better scalability

DigitalCommons@University of Nebraska

Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods

Author: Guan Yawen
Majumder Suman
Reich Brian J.
Saibaba Arvind K.
Publication venue
Publication date: 06/12/2021
Field of study

Analyzing massive spatial datasets using Gaussian process model poses computational challenges. This is a problem prevailing heavily in applications such as environmental modeling, ecology, forestry and environmental heath. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate the spatial covariance parameters and makes spatial predictions with uncertainty quantification. The proposed method, Kryging, applies for both observations on regular grid and irregularly-spaced observations, and for any Gaussian process with a stationary covariance function, including the popular \Matern covariance family. We make use of the block Toeplitz structure with Toeplitz blocks of the covariance matrix and use fast Fourier transform methods to alleviate the computational and memory bottlenecks. We perform extensive simulation studies to show the effectiveness of our model by varying sample sizes, spatial parameter values and sampling designs. A real data application is also performed on a dataset consisting of land surface temperature readings taken by the MODIS satellite. Compared to existing methods, the proposed method performs satisfactorily with much less computation time and better scalability

arXiv.org e-Print Archive

DigitalCommons@University of Nebraska